Design of a Speech Corpus for Research on Cross-Lingual Prosody Transfer

Authors

  • Milan Secujski
  • Branislav Gerazov
  • Tamás Gábor Csapó
  • Vlado Delic
  • Philip N. Garner
  • Aleksandar Gjoreski
  • David Guennec
  • Zoran A. Ivanovski
  • Aleksandar Melov
  • Géza Németh
  • Ana Stojkovic
  • György Szaszák
Abstract

Since the prosody of a spoken utterance carries information about its discourse function, salience, and speaker attitude, prosody models and prosody generation modules have played a crucial part in text-to-speech (TTS) synthesis systems from the beginning, especially those set not only on sounding natural, but also on showing emotion or particular speaker intention. Prosody transfer within speech-to-speech translation is a recent research area with increasing importance, with one of its most important research topics being the detection and treatment of salient events, i.e. instances of prominence or focus which do not result from syntactic constraints, but are rather products of semantic or pragmatic level effects. This paper presents the design and the guidelines for the creation of a multilingual speech corpus containing prosodically rich sentences, ultimately aimed at training statistical prosody models for multilingual prosody transfer in the context of expressive speech synthesis.

Similar resources

Toward transfer of acoustic cues of emphasis across languages

Speech-to-speech (S2S) translation has been of increased interest in the last few years with the research focused mainly on lexical aspects. It has however been widely acknowledged that incorporating other rich information such as expressive prosody contained in speech can enhance the cross-lingual communication experience. Motivated by recent empirical findings showing a positive relation betw...

Cross Lingual Modelling Experiments for Indonesian

The extension of Large Vocabulary Continuous Speech Recognition (LVCSR) to resource poor languages such as Indonesian is hindered by the lack of transcribed acoustic data and appropriate pronunciation lexicons. Research has generally been directed toward establishing robust cross-lingual acoustic models, with the assumption that phonetic lexicons are readily available. This is not the case for ...

Multi-lingual phoneme recognition exploiting acoustic-phonetic similarities of sounds

The aim of this work is to exploit the acoustic-phonetic similarities between several languages. In recent work cross-language HMM-based phoneme models have been used only for bootstrapping the language-dependent models and the multi-lingual approach has been investigated only on very small speech corpora. In this paper, we introduce a statistical distance measure to determine the similarities...

Multi-lingual Prosodic Processing

In our previous research, we have shown that prosody can be used to dramatically improve the performance of the automatic speech translation system VERBMOBIL [9]. The methods to classify prosodic events have been developed on the German sub-corpus of the VERBMOBIL speech database. In this paper we describe how the methods that we developed on the German sub-corpus can be applied to other langua...

Automatic analysis of prosody for multi-lingual speech corpora. Daniel Hirst

This chapter outlines a general approach and describes a set of tools for the automatic analysis of multilingual speech corpora. Two levels of representation can be derived automatically: a phonetic representation, which provides an extremely close copy of the original speech signal, and a surface phonological representation, which reduces the variability to a small number of discrete values wi...

Publication date: 2016